Fix crashes during interpreter shutdown on all Python versions #499

Open
nbouvrette wants to merge 1 commit into python-greenlet:master from nbouvrette:fix/safe-getcurrent-during-finalization
nbouvrette (Contributor) commented Mar 11, 2026

Summary

Fix multiple SIGSEGV crash paths during Py_FinalizeEx on all Python versions (3.10–3.14). Observed in production on ARM64 (Python 3.11 + uWSGI with max-requests worker recycling) where greenlet was installed as a transitive dependency but never explicitly used by application code.

Relationship to PR #495

PR #495 partially addressed this class of crashes by adding murder_in_place() and _Py_IsFinalizing() guards, but only on Python < 3.11 (#if !GREENLET_PY311). Further investigation revealed that:

  1. The vulnerability exists on all Python versions (3.10–3.14), not just < 3.11 — Py_IsFinalizing() is set after atexit handlers complete on every version.
  2. Multiple crash paths were unguarded: getcurrent(), the type checkers (GreenletChecker, MainGreenletExactChecker, ContextExactChecker), and clear_deleteme_list() had no shutdown protection.
  3. PR #495's tests were smoke tests, not regression tests: they pass on both the pre-fix (greenlet 3.1.1) and post-fix (3.3.2) versions, so they cannot detect a reverted fix.

This PR supersedes PR #495 by making all guards unconditional, protecting the remaining crash paths, and adding TDD-certified tests that demonstrably fail on unpatched greenlet 3.3.2.

Design

Two independent guards now protect all shutdown phases:

  1. g_greenlet_shutting_down — an atexit handler registered at module init (LIFO = runs first) sets this flag. Covers the atexit phase of Py_FinalizeEx, where Py_IsFinalizing() is still False on all Python versions.

  2. Py_IsFinalizing() — covers the GC collection and later phases of Py_FinalizeEx. A compatibility shim is provided for Python < 3.13 (where only the private _Py_IsFinalizing() existed).

These guards are checked in mod_getcurrent, PyGreenlet_GetCurrent, GreenletChecker, MainGreenletExactChecker, ContextExactChecker, clear_deleteme_list(), ThreadState::~ThreadState(), _green_dealloc_kill_started_non_main_greenlet, and ThreadState_DestroyNoGIL::AddPendingCall.
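
The two-guard check can be modeled in a few lines of Python. This is a minimal sketch, not the C++ code from this PR: `mark_shutting_down`, `guarded_getcurrent`, and the `real_getcurrent` parameter are hypothetical names, and returning None when guarded is inferred from the "GUARDED (None)" entries in the TDD verification table.

```python
import atexit
import sys

# Hypothetical stand-in for the C++ flag g_greenlet_shutting_down.
_shutting_down = False


def mark_shutting_down():
    """Atexit handler: covers the phase before sys.is_finalizing() flips."""
    global _shutting_down
    _shutting_down = True


def guarded_getcurrent(real_getcurrent):
    """Return None instead of touching state that may already be torn down."""
    if _shutting_down or sys.is_finalizing():
        return None
    return real_getcurrent()


# Registered at import time, mirroring registration at module init.
atexit.register(mark_shutting_down)
```

Either guard alone is enough to short-circuit the call, which is why the two can cover different shutdown phases independently.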

Root cause

_Py_IsFinalizing() is only set after atexit handlers complete inside Py_FinalizeEx on all Python versions:

Py_FinalizeEx()
├── call_py_exitfuncs()              ← atexit phase (Py_IsFinalizing() == False)
│   └── g_greenlet_shutting_down     ← our flag covers this gap
├── _PyRuntimeState_SetFinalizing()  ← Py_IsFinalizing() becomes True
├── _PyGC_CollectIfEnabled()         ← GC phase (__del__ methods run here)
│   └── Py_IsFinalizing()            ← standard API covers this
├── finalize_interp_clear()          ← type objects freed here

Without the guards, code running in atexit handlers (e.g. uWSGI plugin cleanup calling Py_FinalizeEx) or __del__ methods could call greenlet.getcurrent(), reaching into partially-torn-down C++ state and crashing in PyType_IsSubtype via GreenletChecker.
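
The atexit-phase gap can be observed from pure Python. The sketch below assumes CPython and uses `sys.is_finalizing()` as the Python-level view of the finalizing flag. The handler names and `run_child` helper are hypothetical and greenlet is not involved.

```python
import subprocess
import sys
import textwrap

CHILD_SCRIPT = textwrap.dedent(
    """
    import atexit
    import sys

    def registered_first():
        print("registered_first: is_finalizing =", sys.is_finalizing())

    def registered_last():
        print("registered_last: is_finalizing =", sys.is_finalizing())

    atexit.register(registered_first)
    atexit.register(registered_last)   # LIFO: this one runs first
    """
)


def run_child():
    """Run the script in a fresh interpreter and return its stdout lines."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling `run_child()` shows the handler registered last running first, and both handlers seeing the finalizing flag still unset. Note that LIFO ordering is relative to registration time: a handler registered after another one runs before it.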

What changed

C++ shutdown guards (8 files)

| File | Change |
| --- | --- |
| PyModule.cpp | g_greenlet_shutting_down + atexit handler made unconditional (was #if !GREENLET_PY311) |
| CObjects.cpp | PyGreenlet_GetCurrent guard made unconditional |
| PyGreenlet.cpp | murder_in_place() guard made unconditional |
| TThreadState.hpp | clear_deleteme_list() + destructor guards made unconditional |
| TThreadStateDestroy.cpp | AddPendingCall guard extended with g_greenlet_shutting_down |
| greenlet.cpp | Atexit handler registration made unconditional |
| greenlet_refs.hpp | Added guards to GreenletChecker + ContextExactChecker |
| greenlet_internal.hpp | Added guard to MainGreenletExactChecker |

Additional hardening

  • clear_deleteme_list() uses std::swap (zero-allocation) instead of copying the PythonAllocator-backed vector
  • deleteme vector uses std::allocator (system malloc) instead of PyMem_Malloc
  • ThreadState uses std::malloc/std::free instead of PyObject_Malloc
  • clear_deleteme_list() preserves pending Python exceptions around its cleanup loop

Tests (3 files)

  • 5 new TDD-certified regression tests in test_interpreter_shutdown.py — verified RED on greenlet 3.3.2 (UNGUARDED) and GREEN with fix (GUARDED) across Python 3.10–3.14
  • 3 strengthened smoke tests — assert getcurrent() still returns valid objects when called before greenlet's cleanup (guards against over-blocking)
  • Updated file docstring and section headers — organized 21 tests into 4 documented groups
  • Fixed test_dealloc_catches_GreenletExit_throws_other — use sys.unraisablehook instead of stderr capture (pytest compatibility)
  • Fixed test_version — skip gracefully on old setuptools that can't parse PEP 639 SPDX license format
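
The move from stderr capture to sys.unraisablehook can be illustrated without greenlet. This is a sketch of the general technique only: `collect_unraisable`, `Noisy`, and `make_and_drop` are hypothetical, and the actual body of test_dealloc_catches_GreenletExit_throws_other is not shown in this description.

```python
import gc
import sys


def collect_unraisable(trigger):
    """Run `trigger` and return (exc_type, message) for each unraisable error."""
    captured = []

    def hook(unraisable):
        # Store only plain data; keeping the hook argument alive can
        # resurrect the object that is being finalized.
        captured.append((unraisable.exc_type, str(unraisable.exc_value)))

    old_hook = sys.unraisablehook
    sys.unraisablehook = hook
    try:
        trigger()
        gc.collect()  # covers implementations without immediate refcounting
    finally:
        sys.unraisablehook = old_hook
    return captured


class Noisy:
    def __del__(self):
        raise RuntimeError("raised during dealloc")


def make_and_drop():
    Noisy()  # created and immediately released
```

Because the hook receives the exception directly, the check does not depend on who owns stderr at the time, which is the usual reason stderr capture is brittle under pytest.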

TDD verification

Ran both test types against unpatched greenlet 3.3.2 and the patched code across 6 Python versions:

| Python | greenlet 3.3.2 (RED) | Patched (GREEN) |
| --- | --- | --- |
| 3.9 | N/A (requires-python >= 3.10) | N/A |
| 3.10 | UNGUARDED | GUARDED (None) |
| 3.11 | UNGUARDED | GUARDED (None) |
| 3.12 | UNGUARDED | GUARDED (None) |
| 3.13 | UNGUARDED | GUARDED (None) |
| 3.14 | UNGUARDED | GUARDED (None) |

PR #495's tests were also re-evaluated as part of this work: all 9 original tests pass on both greenlet 3.1.1 (pre-#495) and 3.3.2 (post-#495), confirming they are smoke tests that cannot detect regressions. The 5 Group D tests added here are the true regression safety net.
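
For illustration, a shutdown test of this kind typically runs a probe script in a child interpreter and inspects how it exited. The sketch below shows only that harness shape. `run_probe`, `died_from_signal`, and `PROBE` are hypothetical names, and the probe deliberately avoids importing greenlet so it stays self-contained.

```python
import subprocess
import sys


def run_probe(script):
    """Run `script` in a fresh interpreter; return (returncode, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def died_from_signal(returncode):
    """On POSIX, subprocess reports death by signal N as returncode -N."""
    return returncode < 0


# Hypothetical probe: a real regression test would import greenlet here and
# call greenlet.getcurrent() from the atexit handler instead of printing.
PROBE = (
    "import atexit\n"
    "atexit.register(lambda: print('handler ran'))\n"
)
```

A check that only asserts a clean exit behaves like a smoke test. To act as a regression test, the probe also has to print something that differs between unpatched and patched builds, so that the assertion fails when the fix is reverted.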

Additionally, the crash reproducer (uWSGI + Flask on ARM64 Python 3.11) ran 45,000 requests with 0 crashes (15 worker recycling cycles) with the patched greenlet.

Test plan

  • Full local test suite: 158 passed, 3 skipped, 0 failed (pytest)
  • TDD RED/GREEN verification across Python 3.10–3.14 via Docker
  • Crash reproducer: 45,000 requests, 0 segfaults on ARM64 Python 3.11
  • Behavioral review: murder_in_place() guard only fires during shutdown, not normal thread exit (Group B tests verify GreenletExit/finally still work)
  • Full CI on all supported Python versions

Backport note

These fixes have already been backported to the maint/3.2 branch in PR #500 (targeting 3.2.6), since the previous backport (3.2.5 / PR #495) did not fully stabilize shutdown behavior.

nbouvrette added a commit to nbouvrette/greenlet that referenced this pull request Mar 11, 2026
nbouvrette force-pushed the fix/safe-getcurrent-during-finalization branch 2 times, most recently from 17d17e3 to 733a419 on March 11, 2026 05:37
nbouvrette added a commit to nbouvrette/greenlet that referenced this pull request Mar 12, 2026
Ports all crash fixes from the main branch (PR python-greenlet#499) to maint/3.2 for
a 3.2.6 release targeting Python 3.9 stability.

Three root causes of SIGSEGV during Py_FinalizeEx on Python < 3.11:

1. clear_deleteme_list() vector allocation crash: replaced copy with
   std::swap and switched deleteme_t to std::allocator (system malloc).

2. ThreadState memory corruption: switched from PythonAllocator
   (PyObject_Malloc) to std::malloc/std::free.

3. getcurrent() crash on invalidated type objects: added atexit handler
   that sets g_greenlet_shutting_down before _Py_IsFinalizing() is set.

Also fixes exception preservation in clear_deleteme_list(), adds
Py_IsFinalizing() compat shim for Python < 3.13, Windows USS tolerance
for flaky memory test, and additional shutdown tests.

Made-with: Cursor
nbouvrette force-pushed the fix/safe-getcurrent-during-finalization branch from 25a4dfa to e1fdf27 on March 24, 2026 11:20
nbouvrette changed the title from "Fix crash in getcurrent()/greenlet construction during early Py_FinalizeEx" to "Fix crashes during interpreter shutdown on all Python versions" on Mar 24, 2026
During Py_FinalizeEx, multiple greenlet code paths accessed
partially-destroyed Python state, causing SIGSEGV in production
(uWSGI worker recycling on ARM64 and x86_64, Python 3.11).

Root cause: _Py_IsFinalizing() is set AFTER atexit handlers complete
on ALL Python versions, leaving a window where getcurrent() and type
validators reach into torn-down C++ state.

Fix: Two independent guards now protect all shutdown phases:

1. g_greenlet_shutting_down — atexit handler registered at module init
   (LIFO = runs first). Covers the atexit phase where
   Py_IsFinalizing() is still False.

2. Py_IsFinalizing() — covers the GC collection and later phases.
   A compatibility shim is provided for Python < 3.13.

These guards are checked in mod_getcurrent, PyGreenlet_GetCurrent,
GreenletChecker, MainGreenletExactChecker, ContextExactChecker,
clear_deleteme_list, ThreadState destructor,
_green_dealloc_kill_started_non_main_greenlet, and AddPendingCall.

Additional hardening:
- clear_deleteme_list() uses std::swap (zero-allocation)
- deleteme vector uses std::allocator (system malloc)
- ThreadState uses std::malloc/std::free
- clear_deleteme_list() preserves pending Python exceptions

TDD-certified: tests fail on greenlet 3.3.2 and pass with the fix
across Python 3.10-3.14. Test suite: 21 shutdown tests (5 TDD
regression, 2 behavioral, 14 smoke with 3 strengthened).

Also fixes:
- test_dealloc_catches_GreenletExit_throws_other: use
  sys.unraisablehook for pytest compatibility
- test_version: skip gracefully on old setuptools (PEP 639)
- test_no_gil_on_free_threaded: use getattr for pylint compatibility
- Flaky USS memory test on Windows

Made-with: Cursor
nbouvrette force-pushed the fix/safe-getcurrent-during-finalization branch from a4a6510 to 5745a6c on March 24, 2026 11:47
nbouvrette added a commit to nbouvrette/greenlet that referenced this pull request Mar 24, 2026
Backport of PR python-greenlet#499 (master) to maint/3.2 for greenlet 3.2.6, with all
shutdown guards made unconditional across Python 3.9-3.13.

The previous backport (3.2.5 / PR python-greenlet#495) only guarded Python < 3.11,
but the vulnerability exists on ALL Python versions: Py_IsFinalizing()
is set AFTER atexit handlers complete inside Py_FinalizeEx.

Two independent guards now protect all shutdown phases:

1. g_greenlet_shutting_down — atexit handler registered at module init
   (LIFO = runs first). Covers the atexit phase where
   Py_IsFinalizing() is still False.

2. Py_IsFinalizing() — covers the GC collection and later phases.
   A compatibility shim maps to _Py_IsFinalizing() on Python < 3.13.

These guards are checked in mod_getcurrent, PyGreenlet_GetCurrent,
GreenletChecker, MainGreenletExactChecker, ContextExactChecker,
clear_deleteme_list, ThreadState destructor,
_green_dealloc_kill_started_non_main_greenlet, and AddPendingCall.

Additional hardening:
- clear_deleteme_list() uses std::swap (zero-allocation)
- deleteme vector uses std::allocator (system malloc)
- ThreadState uses std::malloc/std::free
- clear_deleteme_list() preserves pending Python exceptions

TDD-certified: tests fail on greenlet 3.3.2 and pass with the fix
across Python 3.10-3.14. Docker verification on Python 3.9 and 3.10
confirms GUARDED on the maint/3.2 branch.

Also fixes:
- SPDX license identifier: Python-2.0 -> PSF-2.0
- test_dealloc_catches_GreenletExit_throws_other: use
  sys.unraisablehook for pytest compatibility
- test_version: skip gracefully on old setuptools
- Flaky USS memory test on Windows

Made-with: Cursor